Stochastic Gradient Descent with GPGPU

نویسندگان

  • David Zastrau
  • Stefan Edelkamp
چکیده

We show how to optimize a Support Vector Machine and a predictor for Collaborative Filtering with Stochastic Gradient Descent on the GPU, achieving 1.66 to 6-times accelerations compared to a CPUbased implementation. The reference implementations are the Support Vector Machine by Bottou and the BRISMF predictor from the Netflix Prices winning team. Our main idea is to create a hash function of the input data and use it to execute threads in parallel that write on different elements of the parameter vector. We also compare the iterative optimization with a batch gradient descent and an alternating least squares optimization. The predictor is tested against over a hundred million data sets which demonstrates the increasing memory management capabilities of modern GPUs. We make use of matrix as well as float compression to alleviate the memory bottleneck.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Fast On-line Statistical Learning on a GPGPU

On-line Machine Learning using Stochastic Gradient Descent is an inherently sequential computation. This makes it difficult to improve performance by simply employing parallel architectures. Langford et al. made a modification to the standard stochastic gradient descent approach which opens up the possibility of parallel computation. They also proved that there is no significant loss in accurac...

متن کامل

Identification of Multiple Input-multiple Output Non-linear System Cement Rotary Kiln using Stochastic Gradient-based Rough-neural Network

Because of the existing interactions among the variables of a multiple input-multiple output (MIMO) nonlinear system, its identification is a difficult task, particularly in the presence of uncertainties. Cement rotary kiln (CRK) is a MIMO nonlinear system in the cement factory with a complicated mechanism and uncertain disturbances. The identification of CRK is very important for different pur...

متن کامل

DeepSpark: Spark-Based Deep Learning Supporting Asynchronous Updates and Caffe Compatibility

The increasing complexity of deep neural networks (DNNs) has made it challenging to exploit existing large-scale data process pipelines for handling massive data and parameters involved in DNN training. Distributed computing platforms and GPGPU-based acceleration provide a mainstream solution to this computational challenge. In this paper, we propose DeepSpark, a distributed and parallel deep l...

متن کامل

Comparison of Modern Stochastic Optimization Algorithms

Gradient-based optimization methods are popular in machine learning applications. In large-scale problems, stochastic methods are preferred due to their good scaling properties. In this project, we compare the performance of four gradient-based methods; gradient descent, stochastic gradient descent, semi-stochastic gradient descent and stochastic average gradient. We consider logistic regressio...

متن کامل

Conjugate gradient neural network in prediction of clay behavior and parameters sensitivities

The use of artificial neural networks has increased in many areas of engineering. In particular, this method has been applied to many geotechnical engineering problems and demonstrated some degree of success. A review of the literature reveals that it has been used successfully in modeling soil behavior, site characterization, earth retaining structures, settlement of structures, slope stabilit...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012